library(tidyverse)
library(vegan)

Disclaimer: Much of this material was taken from a variety of statistical reference guides and websites (see list of sources below).

Our roadmap so far

  1. collected multivariate community data
data(dune)
data(dune.env)

head(dune)
##   Achimill Agrostol Airaprae Alopgeni Anthodor Bellpere Bromhord Chenalbu
## 1        1        0        0        0        0        0        0        0
## 2        3        0        0        2        0        3        4        0
## 3        0        4        0        7        0        2        0        0
## 4        0        8        0        2        0        2        3        0
## 5        2        0        0        0        4        2        2        0
## 6        2        0        0        0        3        0        0        0
##   Cirsarve Comapalu Eleopalu Elymrepe Empenigr Hyporadi Juncarti Juncbufo
## 1        0        0        0        4        0        0        0        0
## 2        0        0        0        4        0        0        0        0
## 3        0        0        0        4        0        0        0        0
## 4        2        0        0        4        0        0        0        0
## 5        0        0        0        4        0        0        0        0
## 6        0        0        0        0        0        0        0        0
##   Lolipere Planlanc Poaprat Poatriv Ranuflam Rumeacet Sagiproc Salirepe
## 1        7        0       4       2        0        0        0        0
## 2        5        0       4       7        0        0        0        0
## 3        6        0       5       6        0        0        0        0
## 4        5        0       4       5        0        0        5        0
## 5        2        5       2       6        0        5        0        0
## 6        6        5       3       4        0        6        0        0
##   Scorautu Trifprat Trifrepe Vicilath Bracruta Callcusp
## 1        0        0        0        0        0        0
## 2        5        0        5        0        0        0
## 3        2        0        2        0        2        0
## 4        2        0        1        0        2        0
## 5        3        2        2        0        2        0
## 6        3        5        5        0        6        0
head(dune.env)
##    A1 Moisture Management      Use Manure
## 1 2.8        1         SF Haypastu      4
## 2 3.5        1         BF Haypastu      2
## 3 4.3        2         SF Haypastu      4
## 4 4.2        2         SF Haypastu      4
## 5 6.3        1         HF Hayfield      2
## 6 4.3        1         HF Haypastu      2
  1. calculated distance matrix
dune_bc <- vegdist(dune, method = "bray")

dune_bc 
##            1         2         3         4         5         6         7
## 2  0.4666667                                                            
## 3  0.4482759 0.3414634                                                  
## 4  0.5238095 0.3563218 0.2705882                                        
## 5  0.6393443 0.4117647 0.4698795 0.5000000                              
## 6  0.6363636 0.5111111 0.5681818 0.6344086 0.2967033                    
## 7  0.5517241 0.4390244 0.4750000 0.5058824 0.2289157 0.2272727          
## 8  0.6551724 0.5365854 0.3250000 0.4117647 0.6385542 0.5909091 0.5250000
## 9  0.6000000 0.4761905 0.3414634 0.3793103 0.5058824 0.6000000 0.4878049
## 10 0.5737705 0.2941176 0.4698795 0.4772727 0.3488372 0.3186813 0.2771084
## 11 0.5600000 0.5405405 0.5555556 0.5844156 0.6266667 0.4500000 0.4444444
## 12 0.9245283 0.7142857 0.4400000 0.5250000 0.6923077 0.6385542 0.6266667
## 13 0.8431373 0.6000000 0.4246575 0.5128205 0.6842105 0.7530864 0.6438356
## 14 1.0000000 0.7878788 0.7500000 0.7971014 0.8805970 0.8055556 0.8750000
## 15 1.0000000 0.9076923 0.7142857 0.7352941 0.8484848 0.8028169 0.8412698
## 16 0.9215686 0.8933333 0.6712329 0.6666667 0.8947368 0.8518519 0.8904110
## 17 0.8787879 0.8245614 0.8909091 0.9000000 0.6206897 0.6825397 0.6727273
## 18 0.7777778 0.5942029 0.6119403 0.6666667 0.5428571 0.4933333 0.5522388
## 19 1.0000000 0.8082192 0.8309859 0.7894737 0.7027027 0.7215190 0.7464789
## 20 1.0000000 0.9452055 0.7746479 0.7631579 0.8918919 0.8481013 0.8873239
##            8         9        10        11        12        13        14
## 2                                                                       
## 3                                                                       
## 4                                                                       
## 5                                                                       
## 6                                                                       
## 7                                                                       
## 8                                                                       
## 9  0.3170732                                                            
## 10 0.5421687 0.6000000                                                  
## 11 0.5277778 0.5945946 0.4133333                                        
## 12 0.4400000 0.3506494 0.7179487 0.6716418                              
## 13 0.3698630 0.4133333 0.7368421 0.7538462 0.3529412                    
## 14 0.5625000 0.7575758 0.7611940 0.8214286 0.6949153 0.6491228          
## 15 0.4285714 0.6615385 0.8484848 0.7454545 0.6206897 0.6785714 0.3617021
## 16 0.4246575 0.6533333 0.8947368 0.8769231 0.5882353 0.6060606 0.5438596
## 17 0.8909091 0.8947368 0.6206897 0.7021277 0.9200000 0.8750000 0.8974359
## 18 0.6417910 0.6811594 0.4857143 0.3220339 0.7419355 0.8000000 0.8431373
## 19 0.7464789 0.7808219 0.7027027 0.5555556 0.6969697 0.8125000 0.8545455
## 20 0.4929577 0.6986301 0.8918919 0.8095238 0.6969697 0.7187500 0.4545455
##           15        16        17        18        19
## 2                                                   
## 3                                                   
## 4                                                   
## 5                                                   
## 6                                                   
## 7                                                   
## 8                                                   
## 9                                                   
## 10                                                  
## 11                                                  
## 12                                                  
## 13                                                  
## 14                                                  
## 15                                                  
## 16 0.3571429                                        
## 17 0.8947368 1.0000000                              
## 18 0.7200000 0.8666667 0.7619048                    
## 19 0.7777778 0.9062500 0.5652174 0.5517241          
## 20 0.2962963 0.3437500 0.9130435 0.6896552 0.7419355
  1. cluster analysis to find groupings in the data

But what if we already knew there were groups or defined structure in our data. For example, let’s say we collected fish in lakes with different levels of fishing or invertebrate communities in three different habitat types. How do we test whether the communities are different among these groups? Your Science paper awaits…

Testing for differences among groups

Motivation:

Non-parametric tests for statistically significant difference between two or more groups of multivariate data points, based on any distance measure. These tests are used in ecology and biogeography for the comparison of taxonomic composition in two or more groups of samples (Hammer & Harper 2006).

Why non-parametric, you ask? Typical ecological community or environmental data will likely not satisfy the assumptions of a MANOVA (multivariate ANOVA) or MANCOVA. There are often more species (variables) than samples, and the potentially large number of zero values means that the probability distribution of counts is likely not to approximate multivariate normality (Clarke & Warwick 2001). Instead, the techniques we’ll discuss today are distribution free.

Data:

Any multivariate dataset divided into groups of items, but normally species abundance or presence/absence in a number of samples divided into two or more groups. These groups need to be specified prior to seeing or collecting the data (i.e., don’t use cluster analysis to find groups and then PERMANOVA or ANOSIM to test for differences betweeen these groups as this is circular logic). Also note that these tests are sensitive to differences in distances within the different groups (Hammer & Harper 2006).

Coding Toolbox:

vegan tutorial: The recommended method to run these analyses in vegan is adonis which implements a multivariate analysis of variances using distance matrices. The function adonis can handle both continuous and factor predictors. Other methods in vegan include multiresponse permutation procedure (mrpp) and analysis of similarities (anosim). Both of these handle only class predictors, and they are less robust than adonis (Oksanen 2015).

Techniques

Today, we’ll walk through three ways to test for significant differences among groups in multivariate community datasets.

  1. Mantel Test
  2. ANOSIM
  3. PERMANOVA

As a side note, if you’re interested in multi-response permutation procedures (MRPP), you can read about them here in the McCune and Grace 2002 chapter (https://www.umass.edu/landeco/teaching/multivariate/readings/McCune.and.Grace.2002.chapte24.pdf) and in the UMASS lecture slides (https://www.umass.edu/landeco/teaching/multivariate/schedule/discriminate1.pdf).
 

1. Mantel Test

Description:

The Mantel test (Mantel 1967) compares two sets of (dis)similarity or distance matrices and tests for linear or monotonic correlation between corresponding elements in these matrices. Non-redundant portions of these matrices (e.g. the lower or upper triangle of symmetrical matrices, excluding the diagonals) are ‘unfolded’ into column vectors and a correlation calculated between these vectors. The matrices being tested must be calculated from data sets with the same objects, but with variables that are independent of one another (Gusta Me).

Null hypothesis: the distances among objects in a matrix of response variables are not linearly correlated with another matrix of explanatory variables.

A significant Mantel test will tell you that the distances between samples in one matrix are correlated with the distances between samples in the other matrix. Therefore, as the distance between samples increases with respect to one matrix, the distances between the same samples also increases in the other matrix (https://jkzorz.github.io/2019/07/08/mantel-test.html).

Note: The Mantel test should not be used as a general method for the investigation of linear relationships or spatial structures in univariate or multivariate data. Its use should be restricted to tests of hypotheses that can only be formulated in terms of distances (Legendre & Fortin 2010).

McCune and Grace 2002 chapter: https://www.umass.edu/landeco/teaching/multivariate/readings/McCune.and.Grace.2002.chapter27.pdf

Check out https://mb3is.megx.net/gustame/hypothesis-tests/the-mantel-test for more information.

Assumptions & Considerations:

  • Tests for linear relationships between (dis)similarity matrices of the same dimensions and from the same set of sample units
  • Response and explanatory variables must be independent
  • Sensitive to differences in dispersions among groups
  • Less robust than PERMANOVA
  • It is recommended that the Mantel test should be used only for hypotheses framed in terms of (dis)similarity or distance between pairs of objects. If hypotheses references raw data, consider constrained analyses such as redundancy analysis (RDA) or canonical correspondence analysis (CCA). (Gusta Me)
  • Note that the Mantel test, which may be useful for analysing a single gradient, is not recommended for investigating more than one gradient at a time (such as spatial gradients in two dimensions), due to the omnidirectional nature of dissimilarities and lack of power (Anderson et al. 2011).
  • Groups should be defined a priori

Example Uses:

The Mantel test can be used to explore and model the relationship between pairwise dissimilarities in community structure and pairwise differences in space, time or environment. Here, interest lies in modelling all pairwise \(\Delta\)y values as a function of \(\Delta\)x (Anderson et al. 2011). For example, we could ask whether differences in insect community composition are related to differences in precipitation. In this case, we could use the Mantel test to look for correlations between a Bray-Curtis dissimilarity matrix calculated for insect species across sites and a Euclidean distance matrix calculated for environmental parameters at those sites (e.g., precipitation differences between sites). These tests can be used to address whether the environment is “selecting” for the community.

From Anderson et al. 2011 (Figure 4)

R Code:

Run using mantel function in the vegan package:

### How well do lichen pastures (varespec) correspond to the environment (varechem)?
# copied from vegan tutorial

data(varespec) #estimated lichen cover of 44 different species
head(varespec)
##    Callvulg Empenigr Rhodtome Vaccmyrt Vaccviti Pinusylv Descflex Betupube
## 18     0.55    11.13     0.00     0.00    17.80     0.07     0.00        0
## 15     0.67     0.17     0.00     0.35    12.13     0.12     0.00        0
## 24     0.10     1.55     0.00     0.00    13.47     0.25     0.00        0
## 27     0.00    15.13     2.42     5.92    15.97     0.00     3.70        0
## 23     0.00    12.68     0.00     0.00    23.73     0.03     0.00        0
## 19     0.00     8.92     0.00     2.42    10.28     0.12     0.02        0
##    Vacculig Diphcomp Dicrsp Dicrfusc Dicrpoly Hylosple Pleuschr Polypili
## 18     1.60     2.07   0.00     1.62     0.00      0.0     4.67     0.02
## 15     0.00     0.00   0.33    10.92     0.02      0.0    37.75     0.02
## 24     0.00     0.00  23.43     0.00     1.68      0.0    32.92     0.00
## 27     1.12     0.00   0.00     3.63     0.00      6.7    58.07     0.00
## 23     0.00     0.00   0.00     3.42     0.02      0.0    19.42     0.02
## 19     0.00     0.00   0.00     0.32     0.02      0.0    21.03     0.02
##    Polyjuni Polycomm Pohlnuta Ptilcili Barbhatc Cladarbu Cladrang Cladstel
## 18     0.13     0.00     0.13     0.12     0.00    21.73    21.47     3.50
## 15     0.23     0.00     0.03     0.02     0.00    12.05     8.13     0.18
## 24     0.23     0.00     0.32     0.03     0.00     3.58     5.52     0.07
## 27     0.00     0.13     0.02     0.08     0.08     1.42     7.63     2.55
## 23     2.12     0.00     0.17     1.80     0.02     9.08     9.22     0.05
## 19     1.58     0.18     0.07     0.27     0.02     7.23     4.95    22.08
##    Cladunci Cladcocc Cladcorn Cladgrac Cladfimb Cladcris Cladchlo Cladbotr
## 18     0.30     0.18     0.23     0.25     0.25     0.23     0.00     0.00
## 15     2.65     0.13     0.18     0.23     0.25     1.23     0.00     0.00
## 24     8.93     0.00     0.20     0.48     0.00     0.07     0.10     0.02
## 27     0.15     0.00     0.38     0.12     0.10     0.03     0.00     0.02
## 23     0.73     0.08     1.42     0.50     0.17     1.78     0.05     0.05
## 19     0.25     0.10     0.25     0.18     0.10     0.12     0.05     0.02
##    Cladamau Cladsp Cetreric Cetrisla Flavniva Nepharct Stersp Peltapht
## 18     0.08   0.02     0.02     0.00     0.12     0.02   0.62     0.02
## 15     0.00   0.00     0.15     0.03     0.00     0.00   0.85     0.00
## 24     0.00   0.00     0.78     0.12     0.00     0.00   0.03     0.00
## 27     0.00   0.02     0.00     0.00     0.00     0.00   0.00     0.07
## 23     0.00   0.00     0.00     0.00     0.02     0.00   1.58     0.33
## 19     0.00   0.00     0.00     0.00     0.02     0.00   0.28     0.00
##    Icmaeric Cladcerv Claddefo Cladphyl
## 18        0        0     0.25        0
## 15        0        0     1.00        0
## 24        0        0     0.33        0
## 27        0        0     0.15        0
## 23        0        0     1.97        0
## 19        0        0     0.37        0
data(varechem) #soil chemistry at same sites
head(varechem)
##       N    P     K    Ca    Mg    S    Al   Fe    Mn   Zn  Mo Baresoil
## 18 19.8 42.1 139.9 519.4  90.0 32.3  39.0 40.9  58.1  4.5 0.3     43.9
## 15 13.4 39.1 167.3 356.7  70.7 35.2  88.1 39.0  52.4  5.4 0.3     23.6
## 24 20.2 67.7 207.1 973.3 209.1 58.1 138.0 35.4  32.1 16.8 0.8     21.2
## 27 20.6 60.8 233.7 834.0 127.2 40.7  15.4  4.4 132.0 10.7 0.2     18.7
## 23 23.8 54.5 180.6 777.0 125.8 39.5  24.2  3.0  50.1  6.6 0.3     46.0
## 19 22.8 40.9 171.4 691.8 151.4 40.8 104.8 17.6  43.6  9.1 0.4     40.5
##    Humdepth  pH
## 18      2.2 2.7
## 15      2.2 2.8
## 24      2.0 3.0
## 27      2.9 2.8
## 23      3.0 2.7
## 19      3.8 2.7
# We first perform a PCA of environmental variables, and then compute dissimilarities for first principal components. 
# We could use a selection of environmental variables in PCA, or we could use standardized environmental variables directly without pca — tastes vary.
pc <- prcomp(varechem, scale = TRUE) #runs pca
pc <- scores(pc, display = "sites", choices = 1:4) #get site scores from pca ordination
edis <- vegdist(pc, method = "euclidean") #creates dissimilarity matrix from pca scores

#ecological dissimilarity matrix
vare.dis <- vegdist(wisconsin(sqrt(varespec)))

#mantel test
mantel(vare.dis, edis)
## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = vare.dis, ydis = edis) 
## 
## Mantel statistic r: 0.3812 
##       Significance: 0.001 
## 
## Upper quantiles of permutations (null model):
##   90%   95% 97.5%   99% 
## 0.148 0.178 0.209 0.251 
## Permutation: free
## Number of permutations: 999

As samples become more dissimilar in terms of soil chemistry, they also become more dissimilar in terms of lichen community composition.

You can also use the Mantel test with a single continuous environmental gradient:

### Do differences in community composition in dune vegetation communities vary with the thickness of the soil A1 horizon?

dune_bc <- vegdist(dune, method = "bray")
dune_A1_dist <- vegdist(dune.env$A1,method="euclidean") #A1 = a numeric vector of thickness of soil A1 horizon

mantel(dune_bc,dune_A1_dist)
## 
## Mantel statistic based on Pearson's product-moment correlation 
## 
## Call:
## mantel(xdis = dune_bc, ydis = dune_A1_dist) 
## 
## Mantel statistic r: 0.2379 
##       Significance: 0.041 
## 
## Upper quantiles of permutations (null model):
##   90%   95% 97.5%   99% 
## 0.182 0.222 0.262 0.308 
## Permutation: free
## Number of permutations: 999

A1 soil distance matrix has a weak but statistically significant relationship with the Bray-Curtis dissimilarity matrix.

Note: vegan function mantel uses permutation tests

See https://jkzorz.github.io/2019/07/08/mantel-test.html for additional code and concepts.
 

2. Analysis of Similarities (ANOSIM)

Description:

Analysis of similarities (ANOSIM) tests the effect of class variables (factors) on a community. ANOSIM has some similarity to an ANOVA-like hypothesis test, however, it is used to evaluate a dissimilarity matrix rather than raw data (Clarke 1993, Gusta Me).

ANOSIM is based on distances between all pairs of samples, computed from a distance index of choice. Only the ranks of the distances are used, making ANOSIM a nonparametric test. ANOSIM works by comparing within-group and across-group distances. If two groups were different in their taxonomic compositions, we would expect the distances within each group to be small relative to the distances across groups. A test statistic R is constructed based on this idea, and its significance estimated by permutating samples across groups (Hammer & Harper 2006).

Null hypothesis: no differences in community composition between two or more groups based on permutation test of among- and within-group similarities; the average of the ranks of within-group distances is greater than or equal to the average of the ranks of between-group distances (Anderson & Walsh 2013).

Under the hood, ANOSIM involves 1) calculating a test statistic, 2) recomputing under permutations, and 3) calculating the significance level.

The ANOSIM statistic compares the mean of ranked dissimilarities between groups to the mean of ranked dissimilarities within groups. An R value close to “1.0” suggests dissimilarity between groups while an R value close to “0” suggests an even distribution of high and low ranks within and between groups. R values below “0” suggest that dissimilarities are greater within groups than between groups. See Clarke and Gorley (2006) for a guide to interpreting ANOSIM R values (Gusta Me).

ANOSIM uses ranked dissimilarities instead of actual distances so it provides a nice complement to an NMDS plot.

Check out https://sites.google.com/site/mb3gustame/hypothesis-tests/anosim for more information.

Assumptions & Considerations:

  • Distribution free; no assumption of multivariate normality
  • Independent samples
  • Ranges and medians (e.g.,dispersions) of ranked dissimilarities within groups are equal, or at least very similar
  • Sensitive to differences in dispersions among groups (group heterogeneities); e.g., you may get a significant results with groups of same mean ranks but different within-group variances
  • Sensitive to differences in correlation structure (shape) among groups and group sizes
  • Can have more variables than samples
  • Less robust than PERMANOVA
  • Groups should be defined a priori
  • Q-mode analysis (looks at associations between objects/sites/groups rather than descriptors)

Example Uses:

Similar to PERMANOVA (discussed below), ANOSIM can be used to test for differences in community structure across different spatial, temporal, or environmental groups.

R Code:

Run using anosim function in the vegan package:

### Impact of management on dune vegetation community?
# copied from vegan tutorial

dune_bc <- vegdist(dune, method = "bray")
dune_ano <- with(dune.env, anosim(dune_bc, Management)) #run anosim
plot(dune_ano) #plot rank information

dune_ano
## 
## Call:
## anosim(x = dune_bc, grouping = Management) 
## Dissimilarity: bray 
## 
## ANOSIM statistic R: 0.2579 
##       Significance: 0.01 
## 
## Permutation: free
## Number of permutations: 999

Dune vegetation community structure differs significantly across management groups.

Side note: mrpp performs a similar function using original dissimilarities rather than ranks.

See https://jkzorz.github.io/2019/06/11/ANOSIM-test.html for additional code and concepts.
 

3. Permutational Multivariate Analysis of Variance (PERMANOVA)

Description:

PERMANOVA involves the geometric partitioning of multivariate variation in the space of a chosen dissimilarity measure according to a given ANOVA design, with p-values obtained using appropriate distribution‐free permutation techniques (Anderson 2017). It is quite similar to ANOSIM and likewise allows a choice of distance measure. Yet, PERMANOVA is based directly on the distance matrix rather than on ranked distances. Like ANOSIM, PERMANOVA uses the permutation of samples to estimate significance, but using a different test statistic called psuedo F. In the special case of the Euclidean distance measure on univariate samples, this F statistic becomes equal to the F statistic of ANOVA (Anderson 2001, Hammer & Harper 2006). PERMANOVA is particularly useful as it can deal with multi-factor experimental or hierarchical sampling designs with interactions (and use non-Euclidean distances). (Anderson 2017)

Null hypothesis: no differences in the centroids among groups, given any differences in within-group dispersions.

The PERMANOVA pseudo F statistic for testing the null hypothesis of no differences in the positions of the group centroids in the space of the chosen dissimilarity measure is given by:

\(F = (SS_A / SS_R) * [(N-g)/(g-1)]\)

where
\(SS_A\) = among-group sum-of-squares

\(SS_R\) = within-group (or residual) sum-of-squares

\(g\) = number of groups

\(N\) = number of sampling units (rows)

Schematic diagram of geometric partitioning for PERMANOVA, shown for g = 3 groups of n = 10 sampling units per group in two-dimensional (bivariate, p = 2) Euclidean space. The total variation in the data cloud (SST) is the sum of two parts: SST = SSA + SSR, where the residual (within-group) sum-of-squares (SSR) is the sum of the squared distances to centroids from individual sampling units (replicates) to their own group centroid (colored dotted lines) and where the among-group sum-of-squares (SSA) is the sum of the squared distances from individual group centroids to the overall centroid (solid black lines). From Anderson 2017 (Figure 1).

Check out https://sites.google.com/site/mb3gustame/hypothesis-tests/manova/npmanova for more information.

Assumptions & Considerations:

  • Semiparametric; has elements of classical partitioning (allowing main effects, interactions, hierarchical structures, etc.) while retaining important robust statistical properties of rank-based nonparametric multivariate methods like ANOSIM (Anderson 2017)
  • Distribution free; no assumption of multivariate normality
  • Exchangeability of permutable units under a true null hypothesis; exchangeable objects are independent and have similar multivariate dispersion
  • Sensitive to differences in dispersion (variance) between groups (we’ll discuss how to test for homogeneity of within-group variances)
  • Robust to heterogeneity for balanced designs but not unbalanced designs
  • Insensitive to differences in correlation structure (shape) among groups
  • Can have more variables than samples
  • Insensitive to many zeroes
  • Note that the dissimilarity measure used can affect the results - moreso than in rank-based tests such as ANOSIM (see Anderson 2017)
  • It is vital that the correct permutational scheme is defined and only exchangeable units are permuted. In nested studies, this would mean restricting permutations to an appropriate subgroup of the data set. At times, exact permutation tests either cannot be done, or are restricted to so few objects, that they are not useful. See Anderson (2001, 2005) for examples of permutational schemes involving complex experimental or sampling designs.
  • Groups should be defined a priori
  • Q-mode analysis (looks at associations between objects/sites/groups rather than descriptors) (Anderson 2001, Anderson 2017, and see slides by A. Ordonez in resources folder)

Example Uses:

PERMANOVA facilitates the analysis of beta diversity (variation in community structure) across multiple spatial or temporal scales, which is quantified directly by such components of variation.

For example, how much of the spatial variation in plant communities is explained by the factors of fire frequency, fencing to prevent grazing, and their interaction?

From Anderson et al. 2011 (Figure 4)

R Code:

Run using adonis or adonis2 functions in the vegan package:

### Does dune vegetation community structure differ with Management classes and soil A1 horizon thickness in the dune meadow data?
#copied from vegan documentation

#default test by terms
adonis2(dune ~ Management*A1, data=dune.env, permutations=999) #LHS can be either a community data matrix or a dissimilarity matrix, e.g., from vegdist or dist. If the LHS is a data matrix, function vegdist will be used to find the dissimilarities. 
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = dune ~ Management * A1, data = dune.env, permutations = 999)
##               Df SumOfSqs      R2      F Pr(>F)   
## Management     3   1.4686 0.34161 3.2629  0.003 **
## A1             1   0.4409 0.10256 2.9387  0.025 * 
## Management:A1  3   0.5892 0.13705 1.3090  0.195   
## Residual      12   1.8004 0.41878                 
## Total         19   4.2990 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#vs. with dissimilarity matrix provided
adonis2(dune_bc ~ Management*A1, data=dune.env, permutations=999)
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = dune_bc ~ Management * A1, data = dune.env, permutations = 999)
##               Df SumOfSqs      R2      F Pr(>F)   
## Management     3   1.4686 0.34161 3.2629  0.003 **
## A1             1   0.4409 0.10256 2.9387  0.026 * 
## Management:A1  3   0.5892 0.13705 1.3090  0.213   
## Residual      12   1.8004 0.41878                 
## Total         19   4.2990 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#overall test
adonis2(dune ~ Management*A1, data=dune.env, permutations=999, by=NULL)
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = dune ~ Management * A1, data = dune.env, permutations = 999, by = NULL)
##          Df SumOfSqs      R2      F Pr(>F)   
## Model     7   2.4987 0.58122 2.3792  0.002 **
## Residual 12   1.8004 0.41878                 
## Total    19   4.2990 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What’s the difference between adonis and adonis2? Check out the documentation (https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/adonis) and this stackexchange response (https://stats.stackexchange.com/questions/476256/adonis-vs-adonis2).

### adonis vs. adonis2 example
adonis(dune ~ Management+A1, data=dune.env, permutations=999)
## 
## Call:
## adonis(formula = dune ~ Management + A1, data = dune.env, permutations = 999) 
## 
## Permutation: free
## Number of permutations: 999
## 
## Terms added sequentially (first to last)
## 
##            Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)   
## Management  3    1.4686 0.48953  3.0730 0.34161  0.004 **
## A1          1    0.4409 0.44089  2.7676 0.10256  0.025 * 
## Residuals  15    2.3895 0.15930         0.55583          
## Total      19    4.2990                 1.00000          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
adonis(dune ~ A1+Management, data=dune.env, permutations=999)
## 
## Call:
## adonis(formula = dune ~ A1 + Management, data = dune.env, permutations = 999) 
## 
## Permutation: free
## Number of permutations: 999
## 
## Terms added sequentially (first to last)
## 
##            Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)   
## A1          1    0.7230 0.72295  4.5382 0.16817  0.003 **
## Management  3    1.1865 0.39551  2.4828 0.27600  0.007 **
## Residuals  15    2.3895 0.15930         0.55583          
## Total      19    4.2990                 1.00000          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
adonis2(dune ~ Management+A1, data=dune.env, permutations=999)
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = dune ~ Management + A1, data = dune.env, permutations = 999)
##            Df SumOfSqs      R2      F Pr(>F)   
## Management  3   1.4686 0.34161 3.0730  0.003 **
## A1          1   0.4409 0.10256 2.7676  0.023 * 
## Residual   15   2.3895 0.55583                 
## Total      19   4.2990 1.00000                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Warning: Nested or hierarchical designs require an appropriate permutational scheme, carefully understanding which objects are truly exchangeable under the null hypothesis. Most importantly, the analyst must define “strata” within which to restrict permutations (Gusta Me).

vegan documentation: If the experimental design has nestedness, then use strata to test hypotheses. For instance, imagine we are testing whether a plant community is influenced by nitrate amendments, and we have two replicate plots at each of two levels of nitrate (0, 10 ppm). We have replicated the experiment in three fields with (perhaps) different average productivity. In this design, we would need to specify strata = field so that randomizations occur only within each field and not across all fields. (https://rdrr.io/rforge/vegan/man/adonis.html)

### Example of use with strata, for nested (e.g. block) designs
#copied from vegan documentation

dat <- expand.grid(rep=gl(2,1), NO3=factor(c(0,10)),field=gl(3,1) )
dat
##    rep NO3 field
## 1    1   0     1
## 2    2   0     1
## 3    1  10     1
## 4    2  10     1
## 5    1   0     2
## 6    2   0     2
## 7    1  10     2
## 8    2  10     2
## 9    1   0     3
## 10   2   0     3
## 11   1  10     3
## 12   2  10     3
Agropyron <- with(dat, as.numeric(field) + as.numeric(NO3)+2) +rnorm(12)/2
Schizachyrium <- with(dat, as.numeric(field) - as.numeric(NO3)+2) +rnorm(12)/2
total <- Agropyron + Schizachyrium
dotplot(total ~ NO3, dat, jitter.x=TRUE, groups=field,
        type=c('p','a'), xlab="NO3", auto.key=list(columns=3, lines=TRUE) )

Y <- data.frame(Agropyron, Schizachyrium)
mod <- metaMDS(Y, trace = FALSE)
plot(mod)

### Ellipsoid hulls show treatment
with(dat, ordiellipse(mod, field, kind = "ehull", label = TRUE))

### Spider shows fields
with(dat, ordispider(mod, field, lty=3, col="red"))

### Incorrect (no strata)
perm <- how(nperm = 999) #how defines permuation design
adonis2 (Y ~ NO3, data = dat, permutations = perm)
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = Y ~ NO3, data = dat, permutations = perm)
##          Df SumOfSqs    R2      F Pr(>F)
## NO3       1 0.026507 0.178 2.1655  0.135
## Residual 10 0.122408 0.822              
## Total    11 0.148915 1.000
adonis2 (Y ~ NO3, data = dat, permutations = 999) #alt version
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
## 
## adonis2(formula = Y ~ NO3, data = dat, permutations = 999)
##          Df SumOfSqs    R2      F Pr(>F)
## NO3       1 0.026507 0.178 2.1655  0.168
## Residual 10 0.122408 0.822              
## Total    11 0.148915 1.000
## Correct with strata
setBlocks(perm) <- with(dat, field)
adonis(Y ~ NO3, data = dat, permutations = perm)
## 
## Call:
## adonis(formula = Y ~ NO3, data = dat, permutations = perm) 
## 
## Blocks:  with(dat, field) 
## Permutation: free
## Number of permutations: 999
## 
## Terms added sequentially (first to last)
## 
##           Df SumsOfSqs  MeanSqs F.Model    R2 Pr(>F)  
## NO3        1  0.026507 0.026507  2.1655 0.178  0.018 *
## Residuals 10  0.122408 0.012241         0.822         
## Total     11  0.148915                  1.000         
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
adonis(Y ~ NO3, data=dat, strata=dat$field, permutations=999) #alt version specifying groups (strata) within which to constrain permutations, https://stats.stackexchange.com/questions/350462/can-you-perform-a-permanova-analysis-on-nested-data
## 
## Call:
## adonis(formula = Y ~ NO3, data = dat, permutations = 999, strata = dat$field) 
## 
## Blocks:  strata 
## Permutation: free
## Number of permutations: 999
## 
## Terms added sequentially (first to last)
## 
##           Df SumsOfSqs  MeanSqs F.Model    R2 Pr(>F)  
## NO3        1  0.026507 0.026507  2.1655 0.178  0.032 *
## Residuals 10  0.122408 0.012241         0.822         
## Total     11  0.148915                  1.000         
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

OK, so maybe you got a significant result, but not so fast…

A test for homogeneity of multivariate dispersions (PERMDISP) in the space of the chosen dissimilarity measure can be done, either to accompany PERMANOVA or in its own right. This test compares within-group spread among groups using the average value of the distances from individual observations to their own group centroid. The specific directions of distances are not taken into account, so PERMDISP, like PERMANOVA, does not identify differences in the shapes of data clouds among groups, only their relative spread (Anderson 2017).

vegan documentation: Function betadisper is a sister function to adonis to study the differences in dispersion within the same geometric framework. Function adonis studies the differences in the group means, but function betadisper studies the differences in group homogeneities. Function adonis is analogous to multivariate analysis of variance, and betadisper is analogous to Levene’s test of the equality of variances.

### Testing for multivariate homogeneity of groups dispersions
# copied from vegan tutorial and documation

# The function can only use one factor as an independent variable, and it does not know the formula interface, so that we need to attach the data frame or use with to make the factor visible to the function:
mod <- with(dune.env, betadisper(dune_bc, Management)) #note that the first input should be a distance matrix

plot(mod) #visualize

boxplot(mod)

anova(mod) #test for significance; are dispersions similar?
## Analysis of Variance Table
## 
## Response: Distances
##           Df  Sum Sq  Mean Sq F value Pr(>F)
## Groups     3 0.13831 0.046104  1.9506 0.1622
## Residuals 16 0.37816 0.023635
TukeyHSD(mod) #you can also analyse pairwise differences between groups
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = distances ~ group, data = df)
## 
## $group
##              diff         lwr       upr     p adj
## HF-BF  0.05956576 -0.26165222 0.3807838 0.9503733
## NM-BF  0.23138915 -0.07962883 0.5424071 0.1862397
## SF-BF  0.14438101 -0.16663697 0.4553990 0.5593144
## NM-HF  0.17182339 -0.09451650 0.4381633 0.2892597
## SF-HF  0.08481525 -0.18152464 0.3511551 0.7992846
## SF-NM -0.08700814 -0.34095326 0.1669370 0.7624415

Testing multiple grouping factors?, see here: https://stat.ethz.ch/pipermail/r-sig-ecology/2010-September/001524.html

And see https://chrischizinski.github.io/rstats/adonis/ for a worked example.  

Comparison between techniques

Check out Anderson & Walsh 2013, which compared PERMANOVA, ANOSIM, and Mantel test results:

These methods all construct ANOVA-like test statistics from a matrix of resemblances (distances, dissimilarities, or similarities) calculated among the sample units, and obtain P values using random permutations of observations among the groups, thereby assuming only exchangeability for the one-way case.

For simulations based on real ecological data sets, PERMANOVA was generally, but not always, more powerful than the others to detect changes in community structure, and the Mantel test was usually more powerful than ANOSIM. Both the error distributions and the resemblance measure affected results concerning power.

Differences in the underlying construction of these test statistics result in important differences in the nature of the null hypothesis they are testing, their sensitivity to heterogeneity, and their power to detect important changes in ecological communities. For balanced designs, PERMANOVA and PERMDISP can be used to rigorously identify location vs. dispersion effects, respectively, in the space of the chosen resemblance measure. ANOSIM and the Mantel test can be used as more ‘‘omnibus’’ tests, being sensitive to differences in location, dispersion or correlation structure among groups. Unfortunately, none of the tests (PERMANOVA, Mantel, or ANOSIM) behaved reliably for unbalanced designs in the face of heterogeneity.

For balanced designs, PERMANOVA was quite robust to heterogeneity, but ANOSIM and the Mantel test were not. These resemblance-based tests are clearly not testing the same null hypothesis. ANOSIM and the Mantel test examine the more general H0: ‘‘samples in the same group are no more tightly clustered together than samples from different groups,’’ whereas PERMANOVA focuses on the more specific H0: ‘‘there are no differences in centroids among the groups.’’ Note that in all cases, the ‘‘clumping of samples within groups’’ or the ‘‘differences in centroids’’ (a shift in the location of the multivariate data cloud) are defined in the space of the resemblance measure chosen for the analysis. As ANOSIM and the Mantel test are more general ‘‘omnibus’’ tests, rejection of the null hypothesis in either case will indicate only that some feature of the groups differ to make them distinct. This feature could be (1) locations, (2) dispersions, (3) the particular shape (correlation structure) of the data clouds being compared; or indeed, some combination of these things.

Importantly, none of the tests examined here were robust to heterogeneity for unbalanced sampling designs, being either excessively liberal or extremely conservative under different scenarios, especially ANOSIM and the Mantel test, which became worse with increasing total numbers of samples.

References and additional resources

Anderson 2001. A new method for non‐parametric multivariate analysis of variance. https://onlinelibrary.wiley.com/doi/10.1111/j.1442-9993.2001.01070.pp.x

Anderson 2005. PERMANOVA: a FORTRAN computer program for permutational multivariate analysis of variance. http://www.primer-e.com/permanova.htm

Anderson et al. 2011. Navigating the multiple meanings of \(\beta\) diversity: A roadmap for the practicing ecologist. https://onlinelibrary.wiley.com/doi/full/10.1111/j.1461-0248.2010.01552.x

Anderson 2017. Permutational Multivariate Analysis of Variance (PERMANOVA). https://onlinelibrary.wiley.com/doi/full/10.1002/9781118445112.stat07841

Anderson & Walsh 2013. PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing? https://esajournals.onlinelibrary.wiley.com/doi/full/10.1890/12-2010.1?casa_token=w0gzukQjxfIAAAAA%3AKKnnt0rDll8ImR-yXOnvF4cXd6s8ZhMr217YHLZWoLwPXwJ90vHBa3lnbSudu_Z5fgf-xDHGagHCYYiB4Q

Clarke 1993. Nonparametric multivariate analyses of changes in community structure. https://onlinelibrary.wiley.com/doi/10.1111/j.1442-9993.1993.tb00438.x

Clarke & Gorley 2006. PRIMER v6: User manual/tutorial.

Clarke & Warwick 2001. Statistical analysis of marine communities. https://www.researchgate.net/publication/221962838_Clarke_KR_Warwick_RMChange_in_Marine_Communities_An_Approach_to_Statistical_Analysis_and_Interpretation_Primer-E_Ltd_Plymouth_UK

Hammer & Harper 2006. Paleontological data analysis. https://onlinelibrary.wiley.com/doi/book/10.1002/9780470750711

Legendre et al. 2005 Analyzing beta diversity: Partitioning the spatial variation of community composition data. https://esajournals.onlinelibrary.wiley.com/doi/full/10.1890/05-0549

Legendre & Fortin 2010. Comparison of the Mantel test and alternative approaches for detecting complex multivariate relationships in the spatial analysis of genetic data. https://onlinelibrary.wiley.com/doi/full/10.1111/j.1755-0998.2010.02866.x?casa_token=vAfWC0ereLUAAAAA%3AkWqErNgPZvmTuKWDlC7fX1a9K3Zw6nBQah1Itk9ZptEYPYTrGNGDaDzjV5JSz8Pu5ODTxokWxzAYihxcxA

Mantel 1967.The detection of disease clustering and a generalized regression approach. https://cancerres.aacrjournals.org/content/27/2_Part_1/209.short

McCune and Grace 2002. Analysis of ecological communities.

Oksanen 2005. Multivariate analysis of ecological communities in R.

Oksanen 2015. Multivariate analysis of ecological communities in R: vegan tutorial. https://www.researchgate.net/publication/260136364_Multivariate_Analysis_of_Ecological_Communities_in_R_Vegan_Tutorial

UMASS lecture and resources (week 4): https://www.umass.edu/landeco/teaching/multivariate/schedule/multivariate_schedule.html

UMASS summary slides: http://www.umass.edu/landeco/teaching/multivariate/schedule/summary.handouts.pdf

Warton et al. 2012. Distance-based multivariate analyses confound location and dispersion effects. https://besjournals.onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00127.x